DataViz Makeover 1

To improve clarity and aesthetics of the Data Visualisation plot given

Tan Yan Ru (MITB, Singapore Management University)
2022-02-13

1. Introduction

For the DataViz assignment, the following data visualisation will be given a makeover to improve its clarity and aesthetics. The dataset used is from Singapore’s Ministry of Manpower.

Final Data Visualisation

2. Evaluation of graph

2.1 Clarity

2.1.1 Generic Title and Missing Subtitle

The graph is missing a subtitle which makes it difficult to know what message the data visualisation is trying to convey. A subtitle can be used to give a description on what information is to be conveyed by the charts and a clearer title to emphasize on what the plot is about.

2.1.2 Redundant Charts

The plots for 70 and over, 70 to 74 and 75 and over have redundancy. Charts used should either use only 70 and above or 70 to 74 and 75 and over as 70 and over will include the age groups 70 to 74 and 75 and above. From the data visualisation, as there is an increasing trend in labour force participation rate from age 70 to 74, removing 70 and above and taking 70 to 74 and 75 and above would be a better option.

2.1.3 Age Group not arranged in order

The plots have been arranged to look like there is an increasing trend in the labour force participation rate. However, there is no link between each age groups and the plot is giving the wrong perception of this by plotting it from smallest to largest participation rate. It would be better to arrange the plots according to age group for viewers to easily find the age group of interest.

2.2 Aesthetics

2.2.1 Age Group Labels

The age group labels at the top x-axis for 75 & Over and 70 to Over have been cut, making it difficult to read. It would be better to rotate the labels vertically.

2.2.2 Unclear X-axis labels

The bottom X-axis of the plot only shows Year 2015 although the individual age group plots show a line trend. The x-axis does not properly show the range of years used.

2.2.3 Y-axis title unit is missing

The label for the Y-axis uses the short-form of Labour Force Participation Rate which is not intuitive for viewers to know what it indicates and the unit of measurement, in this case percentage, is missing. A better y-axis label could be ‘Labour Force Participation Rate (%)’.

3. Alternative Data Visualisation

Data Visualisation Sketch

The improved visualisation will address the generic title, missing subtitles, missing year range, cut age group labels and units for y-axis label. The colours of the area chart trends that are of interest will be a darker, more contrasting colour while the others will be changed to a lighter shade of colours. The data shown will also not include ‘70 and above’ age group as the ‘70 to 74’ and ‘75 and above’ age groups will be used instead.

The below alternative Data Visualisation addresses the issues that were brought up above and highlight some observation discovered in the data and it is the Final Data Visualisation done in Tableau.

Link to Tableau Public: Final Data Visualisation

Final Data Visualisation

4. Observations

Based on the line trend for the different age groups from 2010 to 2021, there has been a steady upward trend in the participation rate for people who are 55 and above while the younger age group has a relatively constant participation rate. This group of people are at the retirement and re-employment age. The increase could be due to Singapore’s government policies to increase retirement and re-employment age to support older people who still wish to continue working. There is also a downward trend seen for the participation rate in age group 20 to 24 which are the group of people that are either have just graduated from Polytechnic or are studying in University. Looking at the line trend for all plots, there is a shift in an older labour force as the participation rate for people below 55 is has a smaller increase in percentage points as compared to those above 55.

5. Preparation of Data and Visualisation

5.1 Data Preparation using Tableau Prep Builder

Before using the Tableau Prep Builder to clean and prepare the data, row numbers have to be added to the excel sheet to allow for easier extraction of the rows for the different Sex in the dataset.

5.1.1 Preparing of Excel Sheet

  1. Add the Row column on the left of the data in sheet, ‘mrsd_Res_LFPR_2’ as shown in the image below.
  2. Save the Excel File once done.
Add row numbers to Resident_Labour_Force_Participation_Rate.xlsx

5.1.2 Tableau Prep Builder

  1. Open Tableau Prep Builder
  2. Drop the excel file that was saved in the previous step as shown below
Drop Excel File into Tableau Prep Builder

The data preparation flow to be done is as shown below.

Data Preparation Flow

To add steps to the flow, click the + sign next to the icons as shown below.

Add Steps to Flow

Step 1: Choose fields required

  1. Double click the icon and tick the following field: Row, Age (Years)/Sex, 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019, 2020 and 2021
Select fields

Step 2: Filter out rows

  1. Add a new step, choose ‘Clean Step’
  2. Double click the Clean Step icon. Create the following filters by following the next step.
  3. Create filter to remove NULL values in rows. Select and right click on the null value and click on ‘Exclude’
Create Filter
  1. Do the same filtering for row ‘70 and above’. The final filters are shown below.
Filter Changes

Step 3: Create Columns to Identify rows for Total, Males and Females

  1. Add a new step, choose ‘Clean Step’
  2. Create Calculated Fields to mark data rows for Total, Females and Males
  3. Click on ‘Create Calculated Field’. Click on the ‘…’ to look for it if it is not found in the list above the tables.
  4. Follow the formula shown below and click ‘Save’
create Sex Calculated Field
  1. Create one more calculated fields to change the names Total, Males and Females to All in Age group Column and get the Difference in percentage point for 2010 and 2021. Follow the formulas below.
create Age Group Calculated Field

Step 4: Using Pivot Table

  1. A pivot table will be used to convert the Year columns into rows of data. Add the Pivot step in the flow.
  2. Double click the Pivot step.
  3. Select the Years 2010 to 2021 and drag it to the Pivoted Fields section.
Pivot Year Columns

Step 5: Extract the final fields required

  1. Add ‘Clean Step’ to the flow
  2. Remove all data rows under ‘Males’ and ‘Females’ in the Sex column. Select Males and Females value, right click and select Exclude

Remove Males and Females data rows 3. Keep the fields Year, Rate and Age Group. Select the fields as shown in the blue box. Right click in one of the selected field and select Keep Only

Keep Final Fields

Step 6: Output the file

  1. Add the Output step in flow.
  2. Browse the folder to save the file in.
  3. Input the name of the file.
  4. Run flow when done to get the output file that will be used for creating the visualisation.
Output File

5.2 Preparation of Data Visualisation

Below explains how to build the plots in Tableau.

5.2.1 Get data into Tableau

  1. Go to the folder where the Labour_Force_Part_Rate.hyper output file was created using the earlier steps done.
  2. Drag and drop the file into Tableau.

5.2.2 Building of Plot

Step 1: Pull in the fields for Columns and Rows

  1. Open a Worksheet in Tableau.
  2. Pull the corresponding fields based on the arrows shown in the image below
Pull Fields into Columns and Rows

Step 2: Edit Y-Axis Label

  1. Select the Y-Axis on the plot and right click to open the Menu.
  2. Click on ‘Edit Axis’
Edit Y-Axis
  1. Change the Axis Title.
Change Axis Title

Step 3: Add Annotations to Plot

  1. Select a point on the line. Right click to get the menu, highlight ‘Annotate’ and select ‘Point’
Add Annotation
  1. Add the text as shown below.
Annotation Text

Step 4: Exclude ‘All’ from Plot

  1. At the Legend section, right click on ‘All’ and choose ‘Exclude’
Exclude ‘All’ in Plot

Step 5: Add Titles and Subtitles to Plot

  1. Add Titles and Subtitles as shown in the Final Plot below.
Final Plot